Some Recent Advances in Speech Recognition with Potential Applications in Other Statistical Pattern Recognition Areas

Author

  • Hervé Bourlard
Abstract

In this talk, we review some recent developments in statistical speech recognition that could also be useful in other statistical pattern recognition applications. Among other issues, we discuss the use of new forms of expert mixtures, for example those based on minimizing the product of error probabilities. This rule, sometimes referred to as the “product-of-errors rule”, has recently been used quite successfully in multi-channel (multi-modal) processing. In speech recognition, it has also been used to implement automatically noise-robust speech recognition approaches (based on frequency subband processing) that require neither noise adaptation nor explicit noise models. In a related framework, we also introduce the theory of “missing data”, which yields significantly improved noise robustness when classifying multi-dimensional feature vectors in which some (unknown) components are prone to noise. Finally, as a further generalization, we discuss a new hidden Markov model (HMM), referred to as HMM2, in which the HMM emission probabilities are themselves estimated by state-dependent (feature-based, secondary) HMMs.

Proceedings of the 16th International Conference on Pattern Recognition (ICPR’02), 1051-4651/02 $17.00 © 2002 IEEE
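
As an illustration of combining experts by the product of error probabilities, the following is a minimal sketch and not the implementation from the talk. It assumes that each expert (for example, one frequency subband) outputs class posteriors, that the error probability of expert k for class c is taken to be 1 - P_k(c), and that the combined scores are renormalized to sum to one. The function name `product_of_errors` and the example numbers are hypothetical.

```python
from math import prod


def product_of_errors(posteriors):
    """Combine per-expert class posteriors with a product-of-errors style rule.

    posteriors: list of lists, posteriors[k][c] = P_k(class c | input of expert k).
    Assumption (not from the source): the error probability of expert k for
    class c is 1 - posteriors[k][c], and errors are combined by multiplication.
    Returns combined scores, renormalized to sum to one.
    """
    n_classes = len(posteriors[0])
    scores = []
    for c in range(n_classes):
        # Product over experts of the probability that each one is wrong about c.
        combined_error = prod(1.0 - p[c] for p in posteriors)
        scores.append(1.0 - combined_error)
    total = sum(scores)
    return [s / total for s in scores]


# Two hypothetical experts over three classes.
clean = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1]]
combined = product_of_errors(clean)
best_class = max(range(len(combined)), key=combined.__getitem__)

# A third expert with a flat (uninformative) posterior, e.g. a noisy subband.
with_noisy = clean + [[1 / 3, 1 / 3, 1 / 3]]
combined_noisy = product_of_errors(with_noisy)
best_class_noisy = max(range(len(combined_noisy)), key=combined_noisy.__getitem__)
```

Under these assumptions, an expert with a flat posterior multiplies every class's combined error by the same constant, so it leaves the ranking of classes unchanged. This is one way to read the claim that such a rule needs no explicit noise model.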

Similar resources

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Automatic Face Recognition via Local Directional Patterns

Automatic facial recognition has many potential applications in different areas of human-computer interaction. However, they are not yet fully realized due to the lack of an effective facial feature descriptor. In this paper, we present a new appearance based feature descriptor, the local directional pattern (LDP), to represent facial geometry and analyze its performance in recognition. An LDP feat...

Publication date: 2002